Skip to content

fix: updated mlx related pkgs + end condition for stopping model request loop#111

Merged
madclaws merged 2 commits intomainfrom
fix/qwen3.5-load-issue
Mar 29, 2026
Merged

fix: updated mlx related pkgs + end condition for stopping model request loop#111
madclaws merged 2 commits intomainfrom
fix/qwen3.5-load-issue

Conversation

@madclaws
Copy link
Copy Markdown
Member

No description provided.

@madclaws madclaws linked an issue Mar 28, 2026 that may be closed by this pull request
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 28, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: ec12eb76-d604-41b7-b301-fae97a935e7d

📥 Commits

Reviewing files that changed from the base of the PR and between 8f68bfd and c4f58ec.

⛔ Files ignored due to path filters (1)
  • server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (14)
  • modelfiles/qwen
  • pkg/scripts/preinstall
  • server/api.py
  • server/backend/mlx.py
  • server/hf_downloader.py
  • server/pyproject.toml
  • server/stack/requirements/app-server/packages-app-server.txt
  • server/stack/requirements/app-server/pylock.app-server.meta.json
  • server/stack/requirements/app-server/pylock.app-server.toml
  • server/stack/requirements/app-server/requirements-app-server.in
  • server/stack/venvstacks.toml
  • server/throttled_download_worker.py
  • tiles/src/main.rs
  • tiles/src/runtime/mlx.rs

📝 Walkthrough

Walkthrough

This PR updates many pinned Python dependencies and lockfiles (notably bumping mlx, mlx-lm, transformers, replacing requests/urllib3 with httpcore/httpx, and adding several tokenizer/CLI packages), adds a preinstall script, and changes container/model files (qwen base image). It removes the explicit model download flow: deletes server/hf_downloader.py, server/throttled_download_worker.py, the /download POST endpoint, and a download helper in server/backend/mlx.py; model provisioning is now performed implicitly via existing model-loading calls. A small runtime change adjusts reply extraction behavior in the MLX Rust runtime and replaces logger initialization in main.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client
participant ServerAPI as Server API
participant Backend as MLX Backend
participant ModelCache as Model Cache
participant HF as HuggingFace (external)
Client->>ServerAPI: request model download (old flow)
ServerAPI->>Backend: pull_model request
Backend->>HF: snapshot_download (throttled worker)
HF-->>Backend: model artifacts
Backend->>ModelCache: write model files
Backend-->>ServerAPI: success/failure
ServerAPI-->>Client: response

mermaid
sequenceDiagram
participant Client
participant ServerAPI as Server API
participant Backend as MLX Backend
participant ModelCache as Model Cache
Client->>ServerAPI: request inference/start
ServerAPI->>Backend: get_or_load_model(model_spec)
Backend->>ModelCache: check cache
alt cached
ModelCache-->>Backend: model path
else not cached
Backend->>ModelCache: attempt implicit load (load_model)
ModelCache-->>Backend: model path / error
end
Backend-->>ServerAPI: model ready
ServerAPI-->>Client: inference response

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Bundle models for the offline installer #101: Changes model provisioning to use model-cache-aware loading and adds get_or_load_model/get_model_cache paths — closely related to removal of explicit downloader and implicit loading here.
  • v0.3.0 + Linux compatibility #30: Modifies server/backend and API model-loading code; overlaps with this PR's removal of download helpers and endpoint.
  • pkg enhancements #97: Alters model provisioning flow and Rust helpers for model-cache/download; directly related to runtime/cache behavior changed in this PR.
🚥 Pre-merge checks | ✅ 1 | ❌ 2

❌ Failed checks (2 warnings)

Check name Status Explanation Resolution
Description check ⚠️ Warning No pull request description was provided by the author, making it impossible to assess whether the description is related to the changeset. Add a pull request description that explains the purpose of the MLX package updates and the end condition logic change in tiles/src/runtime/mlx.rs.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the two main changes: MLX package updates and an addition to handle the end condition for stopping model request loops in the Rust code.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix/qwen3.5-load-issue

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@server/stack/requirements/app-server/pylock.app-server.toml`:
- Around line 233-245: The call to snapshot_download is receiving the removed
parameter "local_dir_use_symlinks" (it's set to False at the earlier assignment
and ends up in kwargs_dict passed to snapshot_download), which will raise a
TypeError with huggingface-hub 1.8.0; remove that key from kwargs_dict before
calling snapshot_download (e.g., pop/remove "local_dir_use_symlinks" from
kwargs_dict in the fallback/throttled-worker download path so the
snapshot_download call no longer receives the unexpected keyword).

In `@server/stack/venvstacks.toml`:
- Line 29: The exception handlers around snapshot_download in
server/hf_downloader.py (the except block catching
requests.exceptions.HTTPError/ConnectionError/Timeout at lines ~120-127) must be
updated to handle the newer huggingface-hub v1.8.0 behavior: replace or extend
the caught exceptions to include httpx.HTTPError and the Hugging Face
hub-specific HTTP error (HfHubHttpError / HfHubHTTPError) and add the
corresponding imports, while keeping any existing requests-based catches as a
fallback so both old and new backends are supported; update the except clause(s)
to catch (httpx.HTTPError, HfHubHttpError) alongside the existing
requests.exceptions.*.

In `@tiles/src/runtime/mlx.rs`:
- Line 553: Remove the unconditional debug print that uses println! of the
variable content (the line containing println!("{}", content)); either delete
that line or replace it with a conditional debug log (e.g., tracing::debug! or
log::debug!) behind a feature flag so the full content is not printed to stdout
in production. Locate the println! call in tiles/src/runtime/mlx.rs where
content is printed and switch to a non-stdout, conditional logging mechanism or
remove it entirely.
- Around line 557-559: The sentinel "END" return in extract_reply (used in
mlx.rs) is causing downstream logic to treat it as a real assistant message
(affecting g_reply emptiness checks and causing "END" to be stored in
conversations/DB); change extract_reply to return Option<String> (None for the
loop-termination case) or otherwise stop returning the literal "END", and update
convert_to_chat_response and all callers (the code that checks g_reply, the
conversation push logic around conversations and DB persistence) to explicitly
handle None as "terminate/skip storing" while preserving empty-string replies as
valid responses so the existing g_reply.is_empty() check still works. Ensure
references: extract_reply, convert_to_chat_response, g_reply, conversations, and
the DB persist path are updated to treat the termination signal separately from
a real reply.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yml

Review profile: CHILL

Plan: Pro

Run ID: 9716b2ef-9554-478b-869d-7f91a9ee317a

📥 Commits

Reviewing files that changed from the base of the PR and between d874af2 and 8f68bfd.

⛔ Files ignored due to path filters (1)
  • server/uv.lock is excluded by !**/*.lock
📒 Files selected for processing (11)
  • server/pyproject.toml
  • server/stack/requirements/app-server/packages-app-server.txt
  • server/stack/requirements/app-server/pylock.app-server.meta.json
  • server/stack/requirements/app-server/pylock.app-server.toml
  • server/stack/requirements/app-server/requirements-app-server.in
  • server/stack/requirements/framework-mlx/packages-framework-mlx.txt
  • server/stack/requirements/framework-mlx/pylock.framework-mlx.meta.json
  • server/stack/requirements/framework-mlx/pylock.framework-mlx.toml
  • server/stack/requirements/framework-mlx/requirements-framework-mlx.in
  • server/stack/venvstacks.toml
  • tiles/src/runtime/mlx.rs

Comment on lines +14 to +16
httpcore==1.0.9
httpx==0.28.1
huggingface-hub==1.8.0
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -n -C2 'import requests|requests\.exceptions|requests\.(get|post)\s*=|snapshot_download\s*\(|HfHubHttpError|set_(async_)?client_factory' server/hf_downloader.py server/throttled_download_worker.py

Repository: tilesprivacy/tiles

Length of output: 2917


🏁 Script executed:

cat -n server/stack/requirements/app-server/packages-app-server.txt

Repository: tilesprivacy/tiles

Length of output: 1515


🏁 Script executed:

# Check if requests appears anywhere in the requirements file
rg -n "requests" server/stack/requirements/app-server/packages-app-server.txt

Repository: tilesprivacy/tiles

Length of output: 44


🌐 Web query:

huggingface-hub 1.8.0 dependencies requests httpx

💡 Result:

huggingface-hub 1.8.0 requires httpx as a core dependency (>=0.23.0, <1), but does not require requests. Since v1.0, the library has migrated from requests to httpx as its HTTP backend, dropping requests as a dependency. httpx is listed under core dependencies on PyPI, while requests is absent. It also appears as an optional dependency in some extras (e.g., oauth), but not required for base installation.

Citations:


Update downloader code to use huggingface-hub v1's httpx-based HTTP backend instead of relying on requests.

requests is no longer in the requirements, but server/throttled_download_worker.py imports it (line 29), monkeypatches requests.get/post (lines 99–100), and catches requests.exceptions.* (lines 117, 132). Similarly, server/hf_downloader.py imports requests in its fallback path (line 116) and catches requests.exceptions.* (lines 122, 125, 128). Hub v1.8.0 migrated from requests to httpx and no longer includes requests as a dependency. The current monkeypatching approach will fail, and the custom throttling will not affect Hub traffic. Use set_client_factory() or set_async_client_factory() to hook into Hub's HTTP client instead.

Comment thread server/stack/requirements/app-server/pylock.app-server.toml
Comment thread server/stack/venvstacks.toml Outdated
Comment thread tiles/src/runtime/mlx.rs Outdated
Comment thread tiles/src/runtime/mlx.rs Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Mar 28, 2026

Codecov Report

❌ Patch coverage is 0% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
tiles/src/main.rs 0.00% 3 Missing ⚠️
tiles/src/runtime/mlx.rs 0.00% 2 Missing ⚠️

📢 Thoughts on this report? Let us know!

- pre-install script to remove prev server folder by the installer
- removed iroh error logs showing in release build
- removed HF deps as downloader is done via RUST
@madclaws madclaws merged commit fdc886c into main Mar 29, 2026
2 of 3 checks passed
@madclaws madclaws deleted the fix/qwen3.5-load-issue branch March 29, 2026 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: Qwen 3.5 models not running

1 participant